On the study of nearest neighbor algorithms for prevalence estimation in binary problems

نویسندگان

  • José Barranquero
  • Pablo González
  • Jorge Díez
  • Juan José del Coz
چکیده

This paper presents a new approach for solving binary quantification problems based on nearest neighbor (NN) algorithms. Our main objective is to study the behavior of these methods in the context of prevalence estimation. We seek for NN-based quantifiers able to provide competitive performance while balancing simplicity and effectiveness. We propose two simple weighting strategies, PWK and PWK, which stand out among state-of-the-art quantifiers. These proposed methods are the only ones that offer statistical differences with respect to less robust algorithms, like CC or AC. The second contribution of the paper is to introduce a new experiment methodology for quantification.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms

A successful software should be finalized with determined and predetermined cost and time. Software is a production which its approximate cost is expert workforce and professionals. The most important and approximate software cost estimation (SCE) is related to the trained workforce. Creative nature of software projects and its abstract nature make extremely cost and time of projects difficult ...

متن کامل

A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...

متن کامل

Facial Expression Recognition Based on Structural Changes in Facial Skin

Facial expressions are the most powerful and direct means of presenting human emotions and feelings and offer a window into a persons’ state of mind. In recent years, the study of facial expression and recognition has gained prominence; as industry and services are keen on expanding on the potential advantages of facial recognition technology. As machine vision and artificial intelligence advan...

متن کامل

Estimation of Density using Plotless Density Estimator Criteria in Arasbaran Forest

    Sampling methods have a theoretical basis and should be operational in different forests; therefore selecting an appropriate sampling method is effective for accurate estimation of forest characteristics. The purpose of this study was to estimate the stand density (number per hectare) in Arasbaran forest using a variety of the plotless density estimators of the nearest neighbors sampling me...

متن کامل

Modelling Climatic Parameters Affecting the Annual Yield of Rheum Ribes Rangeland Species using Data Mining Algorithms

Identification of climatic characteristics affecting the annual yield of Rheum Ribes can be useful in management and development of this species in the rangelands. In this research, the annual yield of this species in Khorasan-Razavi province based on 74 climatic parameters during a ten-year period evaluated and affecting climatic parameters extracted using data mining methods. First, the role ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Pattern Recognition

دوره 46  شماره 

صفحات  -

تاریخ انتشار 2013